chore: align sandbox tooling and policies with upstream OpenShell#24
Merged
Add missing coding agents (Claude CLI, OpenCode, Codex), pin versions for reproducibility (Node.js 22.22.1, npm 11.11.0, uv 0.10.8), create a writable /sandbox/.venv overlay, set PATH/VIRTUAL_ENV/UV_PYTHON_INSTALL_DIR env vars, and bake the GitHub REST-only skill into the base image. Align openclaw and nemoclaw network policies with upstream: add pypi, cursor, and opencode policies; fix vscode wildcard endpoints that silently fail with OPA exact-match; replace hardcoded repo-specific write rules with generic read-only access; normalize policy names to use hyphens.
…nd clean up policies

- Remove deadsnakes PPA, apt Python packages, and pip bootstrap; let uv manage the full Python 3.13 toolchain
- Merge Node.js install + npm upgrade into a single RUN layer
- Merge all npm global installs (vuln fixes + CLI tools) into one call
- Add uv cache clean after python install and venv creation
- Copy base policy.yaml into the image instead of just creating the dir
- Remove duplicate UV_PYTHON_INSTALL_DIR ENV and redundant mkdir
- Update syntax directive from dockerfile:1.4 to dockerfile:1
- Revert nemoclaw/openclaw policies to main and replace repo-specific rules (johntmyers/alpha-claw, bravo-claw) with generic placeholders
f89cc63 to 0d6d027
drew added a commit that referenced this pull request on Mar 13, 2026
The cluster_pods allowed_ips policy was accidentally removed in #24. This policy allows sandbox binaries to reach services on the k3s cluster pod network (10.42.0.0/16), which is required for internal service communication.
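As a rough illustration of what an `allowed_ips` rule for this range admits, the sketch below checks whether a destination address falls inside `10.42.0.0/16`. It is a Python stand-in, not the sandbox's policy engine; the function name `is_cluster_pod_ip` is hypothetical.

```python
import ipaddress

# Pod network CIDR quoted in the commit message above.
CLUSTER_POD_NETWORK = ipaddress.ip_network("10.42.0.0/16")


def is_cluster_pod_ip(address: str) -> bool:
    """Return True if `address` lies inside the k3s cluster pod network."""
    return ipaddress.ip_address(address) in CLUSTER_POD_NETWORK


if __name__ == "__main__":
    for candidate in ("10.42.3.17", "10.43.0.1"):
        print(candidate, is_cluster_pod_ip(candidate))
```

Addresses outside the pod network, such as one in an adjacent `/16`, are not covered by the rule and would still need their own policy entry.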
factory-octavian pushed a commit to factory-octavian/OpenShell-Community that referenced this pull request on Apr 1, 2026
…ons (!17)

Closes NVIDIA#24

## Problem

SSH sessions into the sandbox were **not entering the network namespace** or receiving proxy environment variables. This meant every command run via SSH (the only user-facing path) had unrestricted internet access, completely bypassing OPA network policy enforcement.

The root cause: the SSH server was started **before** the network namespace and proxy were created in `lib.rs`, so it never received the netns fd or proxy URL.

Additionally, **gRPC inference from within the sandbox did not work at all**: even after fixing the netns, multiple issues prevented the Python SDK from reaching the navigator server through the CONNECT proxy.

## Changes

### Sandbox binary

**Core fix (reorder startup + thread netns through SSH):**

- `lib.rs`: Move netns + proxy creation before SSH server start. Compute `ssh_netns_fd` and `ssh_proxy_url`, pass them to `run_ssh_server()`.
- `ssh.rs`: Thread `netns_fd` and `proxy_url` through the full SSH call chain into `spawn_pty_shell()`. Set proxy env vars on the shell command. Call `setns(fd, CLONE_NEWNET)` in `install_pre_exec()`.

**Proxy (control plane allowlist + IPv6 socket lookup):**

- `proxy.rs`: Connections to the navigator endpoint (derived from `NAVIGATOR_ENDPOINT`) are always allowed without OPA evaluation, logged with `engine=control_plane`. This is infrastructure the sandbox needs to function, not a user-configurable policy.
- `procfs.rs`: Extended `parse_proc_net_tcp` to also check `/proc/<pid>/net/tcp6`. gRPC C-core uses `AF_INET6` sockets even for IPv4 connections, so its TCP entries were invisible to the proxy's identity resolver. Also fixed port parsing to use `rsplit_once(':')` for correct IPv6 address handling.

**Proxy env vars (lowercase variants for gRPC C-core):**

- `process.rs` + `ssh.rs`: Added lowercase `http_proxy`, `https_proxy`, `grpc_proxy` alongside uppercase. gRPC C-core (libgrpc) checks lowercase first and was ignoring the uppercase-only vars.
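To make the `procfs.rs` change more concrete, here is a minimal Python sketch of the same idea: read the local-address column from both the `tcp` and `tcp6` tables and split each entry on its last colon to get the port. It is not the Rust implementation; the helper names `parse_port` and `local_ports` and the sample table rows are illustrative assumptions.

```python
from typing import Iterable, Set


def parse_port(addr_field: str) -> int:
    """Extract the port from an `<hex address>:<hex port>` field.

    Splitting on the LAST colon (the Python analogue of Rust's
    `rsplit_once(':')`) keeps the parse correct regardless of how many
    colons appear in the address part.
    """
    _addr, sep, port_hex = addr_field.rpartition(":")
    if not sep:
        raise ValueError(f"no port separator in {addr_field!r}")
    return int(port_hex, 16)


def local_ports(tables: Iterable[str]) -> Set[int]:
    """Collect local ports from the text of one or more /proc net tables."""
    ports: Set[int] = set()
    for text in tables:
        for line in text.splitlines()[1:]:  # first line is the column header
            fields = line.split()
            if len(fields) < 2:
                continue
            ports.add(parse_port(fields[1]))  # fields[1] is local_address
    return ports


if __name__ == "__main__":
    # Hypothetical, abbreviated rows in the shape of /proc/<pid>/net/tcp{,6}.
    tcp = "sl local_address rem_address st\n0: 0100007F:1F90 00000000:0000 0A\n"
    tcp6 = (
        "sl local_address rem_address st\n"
        "0: 0000000000000000FFFF00000100007F:C350 "
        "00000000000000000000000000000000:0000 01\n"
    )
    print(sorted(local_ports([tcp, tcp6])))
```

Passing both tables to one function mirrors the point of the fix: a socket opened as `AF_INET6` only shows up in the `tcp6` table, so a resolver that reads `tcp` alone never sees it.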
### Server

- `sandbox/mod.rs`: Added `CAP_SYS_PTRACE` to sandbox pod security context. Required for the proxy (root) to read `/proc/<pid>/fd/` of sandbox-user processes for binary identity resolution.

### Python SDK

- `inference.py`: Strip `http://`/`https://` scheme from endpoint before passing to `grpc.insecure_channel()`, which expects `host:port`. Default `endpoint` and `sandbox_id` from `NAVIGATOR_ENDPOINT` / `NAVIGATOR_SANDBOX_ID` env vars so `Inference()` works inside sandboxes with no arguments.

### Build / infra

- `ci.toml`: Added `--cap-add=SYS_PTRACE` to `mise run sandbox` to mirror the k8s pod capabilities.

### Documentation

- `architecture/sandbox.md`: Documented `CAP_SYS_PTRACE` requirement and the full set of proxy env vars (uppercase + lowercase).

## Testing

All 29 sandbox unit tests pass. `mise run pre-commit` passes (fmt, clippy, all workspace tests, python lint).

E2E verified on a live cluster:

| Test | Result |
|------|--------|
| Proxy env vars (6 vars, upper+lowercase) | PASS |
| Blocked endpoints (google, anthropic via curl) | PASS (all denied) |
| **gRPC inference from SSH session** (`Inference()` with no args, env var defaults) | **PASS** |
| Proxy log: navigator requests show `engine=control_plane` | PASS |
| Proxy log: blocked requests show `engine=opa` with correct deny reasons | PASS |
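The `inference.py` change listed under Python SDK can be sketched roughly as follows. This is an assumption-laden illustration rather than the SDK's code: the helper name `grpc_target` is hypothetical, and only the environment variable name `NAVIGATOR_ENDPOINT` comes from the commit message.

```python
import os
from typing import Optional


def grpc_target(endpoint: Optional[str] = None) -> str:
    """Return a `host:port` target suitable for a gRPC channel.

    Falls back to NAVIGATOR_ENDPOINT when no endpoint is given and strips a
    leading http:// or https:// scheme if one is present.
    """
    if endpoint is None:
        endpoint = os.environ.get("NAVIGATOR_ENDPOINT", "")
    for scheme in ("http://", "https://"):
        if endpoint.startswith(scheme):
            endpoint = endpoint[len(scheme):]
            break
    if not endpoint:
        raise ValueError("no endpoint given and NAVIGATOR_ENDPOINT is unset")
    return endpoint


if __name__ == "__main__":
    # Hypothetical endpoint value, for illustration only.
    print(grpc_target("http://navigator.example:8080"))
```

The returned string is what would then be handed to `grpc.insecure_channel()`, which per the commit message expects `host:port` rather than a URL.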
factory-octavian pushed a commit to factory-octavian/OpenShell-Community that referenced this pull request on Apr 1, 2026
…tion (#145)

* fix(server): add field-level size limits to sandbox and provider creation

  Closes NVIDIA#24

  Add validate_sandbox_spec and provider field validation with named constants. Configure explicit 1MB tonic max_decoding_message_size. Inference routes excluded per #133 rearchitecture.

* chore: remove issue number references from code comments

---------

Co-authored-by: John Myers <johntmyers@users.noreply.github.com>
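Field-level size validation with named constants might look something like the sketch below. It is a Python illustration of the pattern only: the real validation lives in the server, and every field name and limit here except the 1 MB message cap is a hypothetical placeholder.

```python
from typing import Dict, List

# From the commit message: explicit 1 MB cap on decoded messages.
MAX_MESSAGE_BYTES = 1024 * 1024

# Hypothetical per-field limits, chosen only for illustration.
MAX_NAME_LEN = 253
MAX_IMAGE_LEN = 1024
MAX_ENV_VARS = 128


def validate_sandbox_spec(spec: Dict[str, object]) -> List[str]:
    """Return a list of human-readable violations (empty if the spec is OK)."""
    errors: List[str] = []
    name = str(spec.get("name", ""))
    image = str(spec.get("image", ""))
    env = spec.get("env", {})
    if len(name) > MAX_NAME_LEN:
        errors.append(f"name exceeds {MAX_NAME_LEN} characters")
    if len(image) > MAX_IMAGE_LEN:
        errors.append(f"image exceeds {MAX_IMAGE_LEN} characters")
    if isinstance(env, dict) and len(env) > MAX_ENV_VARS:
        errors.append(f"env has more than {MAX_ENV_VARS} entries")
    return errors


if __name__ == "__main__":
    print(validate_sandbox_spec({"name": "x" * 300, "image": "base"}))
```

Keeping the limits as named constants means each rejection message and each check refer to the same value, which is the property the commit message calls out.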
Summary
Aligns the community base sandbox container and network policies with the upstream `NVIDIA/OpenShell` sandbox that is being removed from `deploy/docker/sandbox/`. After this change, all tools and policies from the upstream sandbox are present in the community repo.

Base Dockerfile (`sandboxes/base/Dockerfile`)

- … 1.2.18), Codex CLI (0.111.0)
- Pinned versions: Node.js 22.22.1, npm 11.11.0, uv 0.10.8
- `python3.13-dev` package (needed for native extensions)
- `@hono/node-server@1.19.11` transitive vulnerability fix (GHSA-wc8c-qw6v-h7f6)
- Writable `/sandbox/.venv` overlay with `--system-site-packages` so sandbox users can `pip install`
- `PATH`, `VIRTUAL_ENV`, `UV_PYTHON_INSTALL_DIR` set in both Dockerfile ENV and `.bashrc`
- GitHub skill (`skills/github/SKILL.md`): REST-only `gh` CLI usage guide
- `/sandbox/.claude/skills/` with symlinks into `.agents/skills/` for agent discovery

Network Policies (`openclaw/policy.yaml`, `nemoclaw/policy.yaml`)

- Add `pypi` (pip/uv package installs), `cursor` (Cursor IDE), `opencode` (OpenCode CLI)
- Fix `vscode` wildcard endpoints: replaced `*.vo.msecnd.net` and `*.gallerycdn.vsassets.io` with exact hosts (`az764295.vo.msecnd.net`, `gallerycdn.vsassets.io`) since OPA uses exact host matching
- Replace hardcoded `johntmyers/alpha-claw` and `johntmyers/bravo-claw` write rules with generic read-only access matching upstream
- `github` → `github_ssh_over_https`, `nvidia` → `nvidia_inference`, `github_repos` → `github_rest_api`
- Normalize policy names to use hyphens (`claude_code` → `claude-code`)
- … `gitlab`, `nvidia_web`, `cluster_pods`, `inference`

Upstream policy coverage

All 8 upstream network policies are now present: `claude_code`, `github_ssh_over_https`, `nvidia_inference`, `github_rest_api`, `pypi`, `vscode`, `cursor`, `opencode`.
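The `vscode` wildcard fix comes down to how exact host matching treats a pattern like `*.vo.msecnd.net`: it is compared as a literal string, so no real hostname ever equals it. The sketch below models that behaviour in Python as a stand-in for the OPA policy; the function name `host_allowed` is hypothetical, and the host lists are the ones quoted in this PR.

```python
from typing import Set

# Endpoint lists as quoted in the PR description.
WILDCARD_ENDPOINTS = {"*.vo.msecnd.net", "*.gallerycdn.vsassets.io"}
EXACT_ENDPOINTS = {"az764295.vo.msecnd.net", "gallerycdn.vsassets.io"}


def host_allowed(host: str, endpoints: Set[str]) -> bool:
    """Exact-match check: the host must equal a listed endpoint verbatim."""
    return host.lower() in endpoints


if __name__ == "__main__":
    host = "az764295.vo.msecnd.net"
    # The wildcard entry is just a string that no hostname equals.
    print("wildcard policy:", host_allowed(host, WILDCARD_ENDPOINTS))
    print("exact policy:   ", host_allowed(host, EXACT_ENDPOINTS))
```

Under this model the wildcard entries fail silently: nothing errors, and the request is simply denied, which matches the symptom the PR describes.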